Reconstructing tongue movements from audio and video

Authors

  • Hedvig Kjellström
  • Olov Engwall
  • Olle Bälter

Abstract

This paper presents an approach to articulatory inversion that uses audio and video of the user’s face and requires no special markers. The video is stabilized with respect to the face, and the mouth region is cropped out. The mouth image is projected into a learned independent component subspace to obtain a low-dimensional representation of the mouth appearance. The inversion problem is treated as one of regression: a non-linear regressor using relevance vector machines is trained on a dataset of simultaneous images of a subject’s face, acoustic features, and positions of magnetic coils glued to the subject’s tongue. The results show the benefit of using both cues for inversion. We envisage the inversion method as part of a pronunciation training system with articulatory feedback.
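
The pipeline described in the abstract (a low-dimensional projection of the mouth image, combined with acoustic features, then non-linear regression onto coil positions) can be sketched roughly as follows. This is only an illustration under loud assumptions, not the authors' implementation: the data is synthetic, PCA via SVD stands in for the learned independent component subspace, and RBF kernel ridge regression stands in for the relevance vector machine. All names, sizes, and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the paper): a hidden 3-D "articulatory"
# state drives mouth images, acoustic features, and coil coordinates.
n_frames, n_pixels, n_acoustic, n_coil_coords = 300, 64, 12, 6
latent = rng.normal(size=(n_frames, 3))
mouth_images = latent @ rng.normal(size=(3, n_pixels))
mouth_images += 0.1 * rng.normal(size=mouth_images.shape)
acoustic = np.tanh(latent @ rng.normal(size=(3, n_acoustic)))
acoustic += 0.1 * rng.normal(size=acoustic.shape)
coils = np.sin(latent @ rng.normal(size=(3, n_coil_coords)))

n_train = 200
train, test = slice(0, n_train), slice(n_train, n_frames)

# Step 1: learn a linear subspace from the training mouth images.
# Assumption: PCA via SVD replaces the independent component subspace.
n_visual = 5
image_mean = mouth_images[train].mean(axis=0)
_, _, vt = np.linalg.svd(mouth_images[train] - image_mean, full_matrices=False)
basis = vt[:n_visual]                      # rows are orthonormal directions
visual = (mouth_images - image_mean) @ basis.T

# Step 2: combine both cues and standardize with training statistics.
features = np.hstack([visual, acoustic])
mu, sigma = features[train].mean(axis=0), features[train].std(axis=0)
features = (features - mu) / sigma

# Step 3: non-linear regression onto coil coordinates.
# Assumption: RBF kernel ridge regression replaces the relevance vector machine.
def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

gamma, ridge = 1.0 / features.shape[1], 1e-2   # hypothetical settings
k_train = rbf_kernel(features[train], features[train], gamma)
weights = np.linalg.solve(k_train + ridge * np.eye(n_train), coils[train])

# Step 4: predict coil coordinates for held-out frames.
pred = rbf_kernel(features[test], features[train], gamma) @ weights
coils_test = coils[test]
rmse = float(np.sqrt(np.mean((pred - coils_test) ** 2)))
print("held-out RMSE on synthetic data:", rmse)
```

To compare cues in this toy setup, one could rebuild `features` from `visual` alone or `acoustic` alone and rerun steps 2 to 4; any such comparison says nothing about the results reported in the paper.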

Similar resources

Reconstructing the tongue surface from six cross-sectional contours: ultrasound data

This work presents a method for reconstructing 3D tongue surfaces during speech from ultrasound data. The method reduces the dimensionality of the tongue surface and maintains highly accurate reproduction of local deformation features. This modification is an essential step if multiplane tongue movements are to be reconstructed practically into tongue surface movements. Earlier work (Stone & Lu...

Synthesising Tongue Movements for conversing Virtual Humans

Facial motion capture is a common method for accurately capturing realistic facial movements. An actor’s performance can be used to bring a virtual human to life. However, the movement of the tongue is often forgotten in character animation. For the most part, the problem arises from the difficulty in capturing tongue movements due to occlusion by the teeth and the lips. Techniques from traditi...

Oral Motor Indexes in Iranian 4- to 9-month-old infants: A short report

Introduction: Examination of oral motor indices in infants helps treatment team members identify infants who are delayed in these indices earlier and use the order of occurrence of these indices in normal infants as a criterion for treating infants with motor delay. Literature shows that oral movement indices are different in different races, cultures, and languages. The aim of this study was t...

Real vs. rule-generated tongue movements as an audio-visual speech perception support

We have conducted two studies in which animations created from real tongue movements and rule-based synthesis are compared. We first studied if the two types of animations were different in terms of how much support they give in a perception task. Subjects achieved a significantly higher word recognition rate in sentences when animations were shown compared to the audio only condition, and a si...

The sight of your tongue: neural correlates of audio-lingual speech perception

While functional neuroimaging studies demonstrate that multiple cortical regions play a key role in audio-visual integration of speech, whether cross-modal speech interactions only depend on well-known auditory and visuo-facial modalities or, rather, might also be triggered by other sensory sources remains unexplored. The present functional magnetic resonance imaging (fMRI) study examined the n...

Publication date: 2006